aggregation error
Resource-Element Energy Difference for Noncoherent Over-the-Air Federated Learning
Over-the-air federated learning (OTA-FL) reduces uplink latency by aggregating client updates directly over the wireless multiple-access channel. Coherent analog aggregation realizes this idea by aligning the phases and amplitudes of simultaneously transmitted waveforms, which typically requires synchronization, instantaneous channel-state information (CSI), phase compensation, and power control. Noncoherent energy detection removes the need for phase-coherent combining, but a single energy measurement is nonnegative and, therefore, cannot represent signed model updates. This paper introduces resource-element energy difference (REED), a noncoherent physical-layer primitive for continuous signed aggregation. REED maps the positive and negative parts of each real-valued update to transmit energies on paired orthogonal resource elements and estimates the signed sum by subtracting the corresponding received energies. The construction uses slow-timescale calibration of average channel powers, but does not require instantaneous transmitter- or receiver-side CSI or channel inversion. For independent Rayleigh fading, we derive exact first- and second-moment expressions for single-shot REED and for a chip-diverse extension that spreads each coordinate over multiple independently faded paired chips. The resulting variance laws separate fading-induced self-noise, signal-noise interaction, and receiver-noise fluctuation, giving an explicit diversity-resource tradeoff.
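The energy-difference idea described in this abstract can be illustrated with a small simulation. This is only a sketch under assumptions that are not stated in the abstract: each client sends amplitude sqrt(energy) on its resource element, every channel is independent Rayleigh with unit average power (standing in for the slow-timescale calibration), and receiver noise is complex Gaussian. The function names `complex_gaussian`, `received_energy`, and `reed_estimate` are hypothetical.

```python
import random


def complex_gaussian(variance, rng):
    """Draw one circularly symmetric complex Gaussian sample CN(0, variance)."""
    std = (variance / 2.0) ** 0.5
    return complex(rng.gauss(0.0, std), rng.gauss(0.0, std))


def received_energy(energies, noise_var, rng):
    """Received energy on one resource element.

    Assumption: each client transmits amplitude sqrt(e) through an
    independent Rayleigh channel with unit average power.
    """
    y = sum(complex_gaussian(1.0, rng) * e ** 0.5 for e in energies)
    y += complex_gaussian(noise_var, rng)
    return abs(y) ** 2


def reed_estimate(updates, num_chips=1, noise_var=0.1, rng=None):
    """Estimate sum(updates) from paired energy measurements.

    Positive parts go on one resource element, negative parts on its pair.
    Averaging over num_chips independently faded chip pairs mimics the
    chip-diverse extension.
    """
    rng = rng or random.Random()
    pos = [max(x, 0.0) for x in updates]
    neg = [max(-x, 0.0) for x in updates]
    diff = 0.0
    for _ in range(num_chips):
        diff += received_energy(pos, noise_var, rng)
        diff -= received_energy(neg, noise_var, rng)
    return diff / num_chips
```

Under these assumptions the expected energy on each element is the sum of the transmitted energies plus the noise variance, so the noise term cancels in the difference and the estimate is unbiased for the signed sum. A single chip pair is very noisy; increasing `num_chips` trades more resource elements for lower variance.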
Trade-off in Estimating the Number of Byzantine Clients in Federated Learning
Chen, Ziyi, Zhang, Su, Huang, Heng
Federated learning has attracted increasing attention in recent large-scale optimization and machine learning research and applications, but it is also vulnerable to Byzantine clients that can send arbitrary erroneous signals. Robust aggregators are commonly used to resist Byzantine clients. Using them usually requires estimating the unknown number $f$ of Byzantine clients and then selecting an aggregator with a suitable degree of robustness (i.e., the maximum number $\hat{f}$ of Byzantine clients allowed by the aggregator). This estimate should have an important effect on performance, which, to our knowledge, has not been systematically studied. This work fills the gap by theoretically analyzing the worst-case error of aggregators, as well as of the federated learning algorithm they induce, for all cases of $\hat{f}$ and $f$. Specifically, we will show that underestimation ($\hat{f}
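As a concrete illustration of a robust aggregator parameterized by $\hat{f}$, the sketch below uses a coordinate-wise trimmed mean. This is a standard robust aggregator chosen here for illustration only; the abstract does not say which aggregators the paper analyzes, and the function name `trimmed_mean` is hypothetical.

```python
def trimmed_mean(updates, f_hat):
    """Coordinate-wise trimmed mean.

    For every coordinate, drop the f_hat smallest and f_hat largest values
    and average the rest. f_hat is the assumed number of Byzantine clients.
    """
    n = len(updates)
    if f_hat < 0 or 2 * f_hat >= n:
        raise ValueError("need 0 <= 2 * f_hat < number of clients")
    dim = len(updates[0])
    aggregated = []
    for j in range(dim):
        column = sorted(u[j] for u in updates)
        kept = column[f_hat:n - f_hat]
        aggregated.append(sum(kept) / len(kept))
    return aggregated
```

With four honest updates near 1.0 and one outlier at 100.0, `f_hat=1` discards the outlier, whereas `f_hat=0` reduces to the plain mean and is pulled far away from the honest values. A larger `f_hat` than necessary discards more honest updates.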
Age-Based Device Selection and Transmit Power Optimization in Over-the-Air Federated Learning
Liu, Jingyuan, Chang, Zheng, Liang, Ying-Chang
Recently, over-the-air federated learning (FL) has attracted significant attention for its ability to enhance communication efficiency. However, the performance of over-the-air FL is often constrained by device selection strategies and signal aggregation errors. In particular, neglecting straggler devices in FL can lead to a decline in the fairness of model updates and amplify the global model's bias toward certain devices' data, ultimately impacting the overall system performance. To address this issue, we propose a joint device selection and transmit power optimization framework that ensures the appropriate participation of straggler devices, maintains efficient training performance, and guarantees timely updates. First, we conduct a theoretical analysis to quantify the convergence upper bound of over-the-air FL under age-of-information (AoI)-based device selection. Our analysis further reveals that both the number of selected devices and the signal aggregation errors significantly influence the convergence upper bound. To minimize the expected weighted sum peak age of information, we calculate device priorities for each communication round using Lyapunov optimization and select the highest-priority devices via a greedy algorithm. Then, we formulate and solve a transmit power and normalizing factor optimization problem for selected devices to minimize the time-average mean squared error (MSE). Experimental results demonstrate that our proposed method offers two significant advantages: (1) it reduces MSE and improves model performance compared to baseline methods, and (2) it strikes a balance between fairness and training efficiency while maintaining satisfactory timeliness, ensuring stable model performance.
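The greedy, age-driven selection step can be sketched as follows. The abstract derives device priorities through Lyapunov optimization but does not give the formula, so this sketch substitutes a hypothetical priority (weight times current age) and a hypothetical age rule (reset to 1 when selected, otherwise increment). The names `select_devices` and `update_ages` are illustrative only.

```python
def select_devices(ages, weights, num_selected):
    """Greedily pick the num_selected devices with the highest priority.

    Assumption: priority = weight * age, a placeholder for the
    Lyapunov-based priority that the abstract mentions but does not define.
    """
    priorities = [w * a for w, a in zip(weights, ages)]
    order = sorted(range(len(ages)), key=lambda i: priorities[i], reverse=True)
    return sorted(order[:num_selected])


def update_ages(ages, selected):
    """Reset the age of selected devices to 1; all other ages grow by 1."""
    chosen = set(selected)
    return [1 if i in chosen else a + 1 for i, a in enumerate(ages)]
```

For example, with ages `[1, 5, 2, 4]`, equal weights, and two slots, devices 1 and 3 are selected and the ages become `[2, 1, 3, 1]`, so devices that are passed over gain priority in later rounds.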
Accelerating Monte Carlo Tree Search with Probability Tree State Abstraction
Fu, Yangqing, Sun, Ming, Nie, Buqing, Gao, Yue
Monte Carlo Tree Search (MCTS) algorithms such as AlphaGo and MuZero have achieved superhuman performance in many challenging tasks. However, the computational complexity of MCTS-based algorithms is influenced by the size of the search space. To address this issue, we propose a novel probability tree state abstraction (PTSA) algorithm to improve the search efficiency of MCTS. A general tree state abstraction with path transitivity is defined. In addition, the probability tree state abstraction is proposed for fewer mistakes during the aggregation step. Furthermore, the theoretical guarantees of the transitivity and aggregation error bound are justified. To evaluate the effectiveness of the PTSA algorithm, we integrate it with state-of-the-art MCTS-based algorithms, such as Sampled MuZero and Gumbel MuZero. Experimental results on different tasks demonstrate that our method can accelerate the training process of state-of-the-art algorithms with a 10% to 45% search space reduction.
CSIT-Free Federated Edge Learning via Reconfigurable Intelligent Surface
Liu, Hang, Yuan, Xiaojun, Zhang, Ying-Jun Angela
We study over-the-air model aggregation in federated edge learning (FEEL) systems, where channel state information at the transmitters (CSIT) is assumed to be unavailable. We leverage the reconfigurable intelligent surface (RIS) technology to align the cascaded channel coefficients for CSIT-free model aggregation. We then develop a difference-of-convex algorithm for the resulting non-convex optimization. Numerical experiments on image classification show that the proposed method achieves learning accuracy similar to that of the state-of-the-art CSIT-based solution, demonstrating the efficiency of our approach in combating the lack of CSIT. With the explosive increase in the number of connected devices at mobile edge networks, machine learning (ML) over a vast volume of data at edge devices has attracted considerable research attention.
A Bayesian Federated Learning Framework with Multivariate Gaussian Product
Federated learning (FL) allows multiple clients to collaboratively learn a globally shared model through cycles of model aggregation and local model training without the need to share data. In this paper, we comprehensively study a new problem named aggregation error (AE), arising from the model aggregation stage on a server, which is mainly induced by the heterogeneity of the client data. Due to the large discrepancies between local models, the accompanying large AE generally results in slow convergence and an expected reduction in accuracy for FL. In order to reduce AE, we propose a novel federated learning framework from a Bayesian perspective, in which a multivariate Gaussian product mechanism is employed to aggregate the local models. It is worth noting that the product of Gaussians is still a Gaussian. This property allows us to directly aggregate local expectations and covariances in a definitely convex form, thereby greatly reducing the AE. Accordingly, on the clients, we develop a new Federated Online Laplace Approximation (FOLA) method, which can estimate the parameters of the local posterior by repeatedly accumulating priors. Specifically, in every round, the global posterior distributed from the server can be treated as the prior, and thus the local posterior can also be effectively approximated by a Gaussian using FOLA. Experimental results on benchmarks reach state-of-the-art performance and clearly demonstrate the advantages of the proposed method.
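The Gaussian-product property mentioned in this abstract has a simple closed form: the product of Gaussian densities is proportional to a Gaussian whose precision is the sum of the individual precisions and whose mean is the precision-weighted average of the individual means. The sketch below shows this for diagonal covariances only, which is a simplifying assumption made here; the abstract does not specify the covariance structure or any client weighting, and `gaussian_product_diag` is a hypothetical name.

```python
def gaussian_product_diag(means, variances):
    """Combine Gaussians N(mean_k, diag(var_k)) by multiplying their densities.

    Per coordinate: precision = sum_k 1/var_k,
    mean = (sum_k mean_k / var_k) / precision.
    """
    dim = len(means[0])
    agg_mean, agg_var = [], []
    for j in range(dim):
        precision = sum(1.0 / v[j] for v in variances)
        weighted = sum(m[j] / v[j] for m, v in zip(means, variances))
        agg_var.append(1.0 / precision)
        agg_mean.append(weighted / precision)
    return agg_mean, agg_var
```

For two one-dimensional local posteriors N(0, 1) and N(3, 0.5), the aggregate is N(2, 1/3): the more confident client (smaller variance) pulls the aggregated mean toward its own estimate, unlike plain parameter averaging, which would give 1.5.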
Federated Machine Learning for Intelligent IoT via Reconfigurable Intelligent Surface
Yang, Kai, Shi, Yuanming, Zhou, Yong, Yang, Zhanpeng, Fu, Liqun, Chen, Wei
Intelligent Internet-of-Things (IoT) will be transformative with the advancement of artificial intelligence and high-dimensional data analysis, shifting from "connected things" to "connected intelligence". This shall unleash the full potential of intelligent IoT in a plethora of exciting applications, such as self-driving cars, unmanned aerial vehicles, healthcare, robotics, and supply chain finance. These applications drive the need to develop revolutionary computation, communication, and artificial intelligence technologies that can make low-latency decisions with massive real-time data. To this end, federated machine learning, as a disruptive technology, has emerged to distill intelligence from the data at the network edge, while guaranteeing device privacy and data security. However, the limited communication bandwidth is a key bottleneck of model aggregation for federated machine learning over radio channels. In this article, we shall develop an over-the-air computation based communication-efficient federated machine learning framework for intelligent IoT networks by exploiting the waveform superposition property of a multi-access channel. A reconfigurable intelligent surface is further leveraged to reduce the model aggregation error by reconfiguring the wireless propagation environment to enhance the signal strength.